I. INTRODUCTION


This thesis investigates the following problem: given
the outlines of two objects, determine whether there are any
regions where they have the same shape. It is implicit that
one of the objects may be partially occluded, so that only a
portion of it is available. Minimal restriction is placed on
the class of objects to be matched. The objects may have
closed or open boundaries (e.g., images of coastlines), with
arbitrary scale and orientation. Furthermore, the matching
must be done in the presence of noise (i.e., geometric
distortions). An example of the type of shapes that will
be studied in this report is given in Figure 1.1.





Figure 1.1 Two Shapes to be Matched

The shape of an object contains a great deal of information
about the object. This is evident from our ability to
recognise or at least guess at objects from their shapes
alone. It is thus not surprising that the problem of shape
description and recognition is fundamental in computer
vision.

Shape is, unfortunately, a largely qualitative concept.
Although we possess an intuitive ability for dealing with
shape, we lack a good quantitative description. Shape is
apparently implicit in our language, where the name of an
object itself contains its shape structure. To appreciate
this, consider Figure 1.2 (adapted from Freeman [Ref. 1]).
Suppose one is required to convey this figure to a distant
friend, say over the telephone. How would one proceed? One
could possibly spend a long time describing it in terms of
the 'two peaks', 'left gentle slope', 'right steep cliff',
etc., and yet at the end still be doubtful whether the
message has been brought across. Consider the alternative
description of 'steep forehead, medium-sized nose, thin lips
and a prominent chin'! (This is of course not restricted to
our perception of shape. We have the same difficulty with
some of the other sensory perceptions too. Thus we speak of
'lemony taste' and 'silky smoothness'.) The main problem in
programming a machine to deal with shape lies largely in the
need to 'explicitize' shape.


Researchers in this field have lamented that there is
little guidance from traditional mathematics [Ref. 2: p.
229]. As pointed out by Blum [Ref. 3], geometry has its roots
in surveying and has developed closely along with the
physical sciences. The general Cartesian view of geometry
metricizes a space and describes a curve in that metric in
some functional form. He observed that this constrained
analysis to shapes of simple functional form rather than
geometric structure.







Figure 1.2 A Sample Shape to be Described 


There has been extensive research on the subject of 
shape representation and recognition [Ref. 4]. Many ad hoc 
techniques have been developed, so that a large assortment 
of tools is now available for solving certain practical 
problems. And, as noted by Rosenfeld in his review paper 
[Ref. 5], the field has begun to develop a scientific basis. 
Recent developments in representation structures in
mathematics have also allowed researchers to move away from
the traditional framework of vector space (using classical
mathematical tools of analysis and linear space) to that of a
structural framework (using modern tools such as graphs and
grammars).

Applications of computer vision are wide and varied. 
These include character recognition, fingerprint identifica- 
tion, microscopy, radiology, robot vision, remote sensing 
and navigation, to name a few. Many of the successful
applications of shape recognition have been primarily
two-dimensional. The most general problem of recognition of a
partially occluded three-dimensional object of unknown 
scale, orientation and aspect remains a research topic. 

This thesis is confined to two-dimensional shapes. It
assumes that the outline of the object has been extracted
and pre-processed to smooth out some of the noise. Early
in this investigation, it was realised that the problem is
two-fold: there is the representation problem and the
matching problem ('recognition' and 'matching' will be used
interchangeably throughout this report). The representation
problem is largely geometric in nature, whereas matching is
primarily an algorithmic problem. However, the means of
representation determines the complexity of the matching
algorithm and, more importantly, places a limit on the
capability of the matching algorithm. Thus a representation
based on Fourier Descriptors, for example, would not be
able to handle the partial occlusion problem because of its
global nature.

The following chapter contains a survey of the various
techniques that have been developed for the analysis of
two-dimensional shapes. Chapter Three summarizes the
initial findings of this investigation and introduces a new
representation and matching algorithm. This representation
scheme is both scale and orientation invariant. The
matching algorithm is similar to the Hough Transform, but it
has several distinct features that make it scale and
orientation invariant too. Chapter Four presents the final
results of this investigation - a new correlation technique
that is simple and robust. This technique is applied to a
number of test shapes, and the results verify that it is
capable of recognising parts of a shape of unknown scale
and orientation. The ability to discriminate between
two different shapes is also demonstrated, and the
weaknesses of the technique are discussed. Finally, the last
chapter summarizes the key results obtained and offers
suggestions for future work.


II. SURVEY


A. INTRODUCTION 

The recognition of shape is a relatively old problem
that has recently been taken up by engineers and computer
scientists. Psychologists have long puzzled over the
ability of humans and animals to discriminate shapes. A
collection of very interesting papers on the early studies
of form perception and discovery can be found in Uhr
[Ref. 6]. The early experiments suggested that
the information in an object outline is concentrated at
those points having high curvature. This idea is in fact
the basis for several of the current techniques for shape
recognition [Ref. 7: p. 165].

This chapter contains a survey of the techniques
developed for two-dimensional shape recognition. It is not
intended to be a complete survey, but rather to be
indicative of the variety of techniques that have been
examined and to demonstrate the difficulties facing
researchers in this area.

For convenience, these techniques are grouped into three
categories, according to the matching scheme used. These
are:

a. Template matching
b. Feature matching
c. Transform parameter matching


B. TEMPLATE MATCHING 

Template matching is the oldest technique developed.
This is basically a two-dimensional cross-correlation
between the reference shape (the 'template') and the test
shape. One may visualize template matching by imagining the
template being shifted across the test shape to different
offsets and determining the amount of overlap. In its basic
form, template matching is of limited use.

Many variants to this basic method have been proposed. 
Most of these involve some sort of hierarchical template 
matching process. In this, sub-templates for parts of the 
objects are first matched. One then looks for combinations
of partial matches in approximately the correct relative
positions. The computational cost is obviously high. Also,
template matching breaks down when the two shapes to be 
matched are of different scales. 

The two-dimensional correlation can be converted to a 
one-dimensional correlation by coding the boundary in some 
appropriate functional form. Possible coding schemes
include the radius-angle, orientation-arc length and
curvature-arc length representations.

The radius-angle (or polar) representation requires a 
reference origin. This is usually taken to be the object's
centroid. This representation is obviously scale-dependent. 
The need for a reference origin also makes it unsuitable for 
partially occluded objects and those with open boundaries. 
Also the need for the representation to be single-valued 
further restricts the type of shapes that can be coded in 
this manner. 

The orientation-arc length representation codes the
angle made between a fixed axis and a tangent to the
boundary as a function of the arc length. This
representation is scale invariant, but not orientation
invariant. Horizontal straight lines in this representation
correspond to zero curvature (i.e., straight lines in the
boundary), and non-horizontal straight lines correspond to
segments of circles whose curvature is given by the slopes
of the lines (the radius of curvature being the reciprocal
of the slope). (This allows the boundary to be easily
segmented into straight lines and circular arcs, and is
sometimes used in the initial processing for feature
matching.)

The curvature-arc length representation codes the
curvature of the boundary as a function of arc length. This
representation is orientation invariant. Unfortunately it
is not scale independent. (A circle of radius r, for
example, has a curvature of 1/r.) Also, curvature is very
sensitive to noise. Curvature is, however, a popular
descriptor, and this representation is often used to extract
the extrema (in curvature) for feature matching [Ref. 8].
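
The two codings above can be illustrated with a minimal sketch. It assumes the boundary is available as an ordered list of (x, y) samples; the function names orientation_code, curvature_code and circle are hypothetical and introduced only for this illustration, and the tangent angle is approximated by the direction of each polyline segment.

```python
import math

def orientation_code(points):
    """Return (arc length, tangent angle) pairs, one per polyline segment."""
    s, code = 0.0, []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        code.append((s, math.atan2(y1 - y0, x1 - x0)))
        s += math.hypot(x1 - x0, y1 - y0)
    return code

def curvature_code(points):
    """Discrete curvature: change in tangent angle per unit arc length."""
    code = orientation_code(points)
    kappa = []
    for (s0, t0), (s1, t1) in zip(code, code[1:]):
        # wrap the angle difference into [-pi, pi)
        dtheta = (t1 - t0 + math.pi) % (2 * math.pi) - math.pi
        kappa.append(dtheta / (s1 - s0))
    return kappa

def circle(radius, n=360):
    """Sample a closed circle of the given radius at n equal angular steps."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n + 1)]

# Doubling the radius halves the curvature (approximately 1/r in each case),
# which is the scale dependence noted in the text.
print(curvature_code(circle(2.0))[0], curvature_code(circle(4.0))[0])
```

The curvature values do not change if the circle is rotated, but they do change with its radius, which is why this coding is orientation invariant yet scale dependent.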

A discrete version of the orientation-arc length
representation has also been used. Commonly called the chain
code, this codes the boundary into short line segments that
lie on a fixed grid with a fixed set of orientations.
Although efficient in representation and cross-matching,
chain codes are rather sensitive to noise and have other
shortcomings that make this representation unsuitable for
general shape matching [Ref. 9].

None of the representations discussed above is
simultaneously scale and orientation invariant. The problems
in obtaining a 'truly intrinsic' representation of the
boundary are further discussed in the next chapter.


C. FEATURE MATCHING 

Another approach to shape matching is to construct a 
structural model of the shape. This model describes the
spatial decomposition of a shape in terms of features or
shape primitives. There are no established guidelines for 
choosing shape primitives; however it is desirable that 
these primitives provide a compact description of the shape 
and be easily extracted from the shape. 

A reading through the literature reveals a wide variety
of primitives that have been used. Most of these are based
(explicitly or implicitly) on curvature. These include
curvature maxima and minima, corners, protrusions,
intrusions, linear segments, quadratic segments, circular
arcs, convex blobs, T-shaped parts, etc. (see for example
[Refs. 10,11]).

These primitives are often further qualified by a set of
attributes, e.g., large, sharp convex corner facing North.
Once the primitives are obtained, relationships between them
are computed. Examples of these relationships are
adjacency, collinearity, symmetry, etc.

The matching algorithm depends on the type of structural
model. There are essentially two kinds of structural
models, the relational model and the grammatical model
[Ref. 12: pp. 426 to 434]. In a relational model, the
primitives appear as nodes in a tree or graph structure.
Nodes are connected according to their relationships. The
matching algorithm typically involves a search for
corresponding nodes in the two relational structures to be
matched.

A grammatical model makes use of formal language theory to
describe how the primitive pieces of the shape are joined
together. A grammar consists of three types of entities:
terminal or primitive symbols, non-terminal symbols and
production rules. A grammar can be used to construct
strings of primitive symbols (called sentences) by
successive application of the production rules. The set of
all sentences that can be generated using a given grammar is
called the language of the grammar. Object recognition is
then a process of determining whether a sentence (which
describes the object) belongs to a given language, by parsing
it with respect to the grammar of the language.

A major problem with the grammatical model is the
construction of a grammar that is comprehensive enough to 
generate all the possible types of shapes of interest and 
yet discriminatory enough to reject others. A number of 
grammars have been developed over the years. A good 
description of these can be found in [Ref. 13: pp. 365 to 382]. 

A common problem with these relational and grammatical
models is the effect of noise. Noise complicates the
process of computing the appropriate structures. This is
normally handled by preprocessing the shape boundary,
usually by some sort of piecewise linear fit (polygonal
approximation) [Ref. 14: p. 275]. Here one runs into the
problem of how to locate the breakpoints, i.e., where a
linear segment should end and a new segment begin
[Ref. 2: p. 232]. A number of criteria have been proposed
[Ref. 7: pp. 168 to 184]. Recently the use of piecewise
polynomials (mainly B-splines) has become popular. B-splines
have a number of computational and representational
advantages. For example, their 'local' characteristics and
'terse' representation allow programs to manipulate them
easily [Ref. 2: p. 239]. As with piecewise linear
approximation, B-spline approximation is also sensitive to
the placement of breakpoints (knots).

It is evident that within the structural framework, one
gains considerably greater freedom of representation, but
loses the convenience of vector space and the analytical
tools there. The shape primitives and their relationships
tend to be more qualitative than quantitative in
description. For example, a primitive like 'sharp corner'
does not carry numerical values of the degree of sharpness or
the extent of the corner. Without a quantitative description,
standard similarity measures such as least mean square
differences cannot be easily applied. This also implies
that the feature matching technique performs better in
classifying shapes into their generic classes (those
generated by the particular grammar) than in distinguishing
between objects from the same class.

This approach is highly suited to scene understanding
applications, where a 'literal', i.e., qualitative,
description of the scene can be built up and compared with
another scene [Ref. 5]. It is of limited use in applications
such as change detection, where detailed matching of specific
boundaries is required. This technique will not be discussed
further in this report.


D. TRANSFORM PARAMETER MATCHING

The above two classes of matching techniques operate on 
the original two-dimensional spatial information. Another 
approach is to transform the original data into a different 
domain and to perform the matching in this new domain. This 
method is no doubt motivated by the success of the frequency 
approach in electrical engineering analysis. It is thus not
surprising that the Fourier series representation of the
parameterized boundary is one of the oldest and most popular
transform techniques.

The boundary may be coded in any of the representation
schemes discussed in the earlier section. For a closed
boundary these representations are periodic, and can thus be
expanded into a Fourier series. A common feature of the
Fourier Descriptors (as the coefficients of the series are
called) is that the general shape is given rather well by a
few of the low-order terms (important for data compression
applications). Properly parametrized, the coefficients can be
made independent of scale and orientation [Ref. 2: p. 238].
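
One common way of obtaining such invariance can be sketched as follows. This is an illustration under stated assumptions, not necessarily the parametrization used in [Ref. 2]: the boundary is treated as a complex sequence x + iy sampled around a closed curve traversed counter-clockwise, the zeroth coefficient is dropped (position), the remaining magnitudes are divided by that of the first coefficient (scale), and phase is discarded (rotation and starting point). The names fourier_descriptors and lobed_shape are hypothetical.

```python
import cmath
import math

def fourier_descriptors(points, n_terms=6):
    """Normalised Fourier descriptor magnitudes of a closed boundary."""
    z = [complex(x, y) for x, y in points]
    n = len(z)

    def coeff(k):
        # k-th coefficient of the discrete Fourier series of the boundary
        return sum(z[m] * cmath.exp(-2j * math.pi * k * m / n)
                   for m in range(n)) / n

    scale = abs(coeff(1))  # dividing by |c_1| removes the scale factor
    # c_0 (position) is skipped; taking magnitudes discards rotation
    return [abs(coeff(k)) / scale for k in range(2, 2 + n_terms)]

def lobed_shape(n=64, scale=1.0, angle=0.0, shift=(0.0, 0.0)):
    """Hypothetical three-lobed test boundary, r = 1 + 0.3 cos(3t)."""
    pts = []
    for m in range(n):
        t = 2 * math.pi * m / n
        r = scale * (1.0 + 0.3 * math.cos(3 * t))
        pts.append((r * math.cos(t + angle) + shift[0],
                    r * math.sin(t + angle) + shift[1]))
    return pts

original = fourier_descriptors(lobed_shape())
moved = fourier_descriptors(lobed_shape(scale=2.5, angle=0.7, shift=(3.0, -1.0)))
print(original)
print(moved)  # agrees with the line above to within rounding error
```

Every coefficient is a sum over all boundary samples, so removing or altering part of the boundary changes every descriptor.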

However, this description is global in nature, i.e., each
coefficient depends on every point on the boundary. It is
therefore not suitable for matching partially occluded 
objects. Also, the Fourier descriptors can distinguish 
among symmetrical curves only on the basis of the phase of 
the descriptors. This, unfortunately, cannot be reliably 
computed in many cases. Thus, the descriptors of the 
contours of '2' and '5' are virtually identical [Ref. 4]. 

In contrast to the Fourier descriptors, which describe
the boundary, another transform technique, the method of
moments, describes the interior points of the shape. In this
technique, coordinates of points belonging to the shape are
used to compute a set of moments. These moments can be
normalised to obtain measures that are invariant under
scaling and rotation [Ref. 13: p. 354]. It is difficult to
relate higher moments to the shape and, furthermore, this is
also a global transform; thus it is not suitable for
partially hidden objects either.

A new transform technique appeared in the literature 
recently [Ref. 15]. It treats a shape outline as a set of 
discrete data that is generated by an autoregressive model. 
An autoregressive model is a parametric equation that
expresses each sample of an ordered set of data samples as a 
linear combination of a specified number of previous samples 
from the set plus an error term. This model is widely used 
in speech modelling and spectral estimation. The shape is 
then described by the model parameters. 
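
As a sketch of the form such a model takes (the symbols $x_n$, $a_k$, $p$ and $e_n$ are notation assumed here for illustration and are not taken from [Ref. 15]), an autoregressive model of order $p$ may be written as:

```latex
x_n = \sum_{k=1}^{p} a_k \, x_{n-k} + e_n
```

Here $x_n$ is the n-th sample of the coded boundary, the $a_k$ are the model parameters that describe the shape, and $e_n$ is the error term.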

However, unlike conventional digital signal processing,
where the sample interval is determined physically (and
uniquely) by an external reference (namely time), the
spacing of the samples obtained from a shape boundary is
determined by the scale factor of the image of the object.
The model can be made scale independent if the samples are
taken at fixed angular intervals from, say, the centroid of
the shape. The centroid is, however, a global feature, which
then makes this scheme unsuitable for partially occluded
objects.

Another interesting transform technique makes use of 
geometric transformation to map instances of a given shape 
pattern into peaks of a transform space. This so-called 
Hough Transform was originally developed to handle simple 
shapes such as straight lines and circles, but it was 
recently extended to arbitrary shapes [Ref. 16]. We will 
describe this technique in some detail, as it will be the
basis for a new matching algorithm to be developed in the 
next chapter. The description below is adapted from Ballard 
[Ref. 2: p. 128]. 

Consider an object with known scale and orientation.
Pick a reference point $(x_c, y_c)$ in the silhouette (see
Figure 2.1). At each boundary point $(x_i, y_i)$, compute the
gradient direction $\phi_i$ and the vector $r$. The magnitude
of this vector is the length of the line joining the
reference point to the boundary point, and its direction is
given by the angle $\alpha$ between this line and the x-axis.
Store $r$ as a function of $\phi_i$. This representation is
multivalued, and in general an index $\phi_i$ may have many
values of $r$. The set of all such vectors indexed by
$\phi_i$ forms what is called the R-table. Table 1 shows the
form of the R-table.


Figure 2.1 Hough Transform


The R-table is used to detect instances of a shape as
follows. First, an accumulator array of possible locations
of the reference point, $A(x_c, y_c)$, is initialised to
zero. For each boundary point $(x_i, y_i)$ of the test shape,
compute its gradient angle $\phi_i$. For each vector indexed
by this angle in the R-table, compute the possible location
of the reference point. That is, for each table entry of
$\phi_i$, compute

$$x_c = x_i + r(\phi_i) \cos[\alpha(\phi_i)]$$

$$y_c = y_i + r(\phi_i) \sin[\alpha(\phi_i)]$$

Next, increment the accumulator array corresponding to this
location, i.e.,

$$A(x_c, y_c) = A(x_c, y_c) + 1$$

The peaks in the accumulator array then correspond to
possible instances of the shape.


TABLE 1
R-TABLE

Angle Measured from Boundary    Set of Vectors
to Reference Point              $r = (r, \alpha)$

$\phi_1$                        $r_1^1, r_2^1, \ldots$
$\phi_2$                        $r_1^2, r_2^2, \ldots$





This technique can be summarised as follows. For the 
reference shape, code the boundary with respect to a fixed 
reference point. For the test shape, use this coding to 
reconstruct the possible locations of the reference point. 
A cluster of possible locations would be obtained. If the 
two shapes are identical, there would be a peak at the loca- 
tion of the original reference point. 
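
The procedure summarised above can be sketched in a few lines. This is a minimal illustration, assuming each boundary sample is supplied as (x, y, gradient angle) and that gradient angles are quantised into 36 bins; the names quantise, build_r_table, vote and square_edges are hypothetical, and a sparse counter stands in for the accumulator array.

```python
import math
from collections import Counter, defaultdict

def quantise(phi, n_bins=36):
    """Map a gradient angle (radians) to one of n_bins table indices."""
    return int(round(phi / (2 * math.pi) * n_bins)) % n_bins

def build_r_table(edge_points, ref, n_bins=36):
    """Store, per gradient angle, the vectors (r, alpha) to the reference point."""
    xc, yc = ref
    table = defaultdict(list)
    for x, y, phi in edge_points:
        r = math.hypot(xc - x, yc - y)
        alpha = math.atan2(yc - y, xc - x)
        table[quantise(phi, n_bins)].append((r, alpha))
    return table

def vote(edge_points, table, n_bins=36):
    """Accumulate votes for candidate reference point locations."""
    acc = Counter()
    for x, y, phi in edge_points:
        for r, alpha in table.get(quantise(phi, n_bins), []):
            xc = round(x + r * math.cos(alpha))
            yc = round(y + r * math.sin(alpha))
            acc[(xc, yc)] += 1
    return acc

def square_edges(ox, oy, side=10):
    """Hypothetical test shape: a square with outward gradient directions."""
    pts = []
    for k in range(side + 1):
        pts.append((ox + k, oy, -math.pi / 2))        # bottom edge
        pts.append((ox + side, oy + k, 0.0))          # right edge
        pts.append((ox + k, oy + side, math.pi / 2))  # top edge
        pts.append((ox, oy + k, math.pi))             # left edge
    return pts

table = build_r_table(square_edges(0, 0), ref=(5, 5))
acc = vote(square_edges(20, 7), table)
# The strongest peak falls at the shifted reference point (25, 12).
print(acc.most_common(3))
```

Locations near the true peak also collect votes from unrelated pairings of boundary points, a small-scale instance of the random matches discussed below.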

In this form, the Hough Transform has several
limitations. It requires the reference and test objects to be
of the same scale and orientation. Computational complexity
increases rapidly if it is necessary to deal with variations
in scale and orientation. Thus, to account for orientation,
the above procedures must be repeated for every orientation
to be distinguished. If it is required to distinguish
orientations, say, 10 degrees apart, the procedures must be
repeated 36 times, resulting in 36 accumulator arrays. The
best match would then be identified by the largest
accumulator value among all of the 36 arrays. Scale
variations must be handled in the same way. A more serious
objection is that the transform suffers from false peaks in
the accumulator array due to random matches.

In the next chapter, it will be shown how with a 
different boundary representation scheme, this method can be 
modified to make it scale and orientation invariant. 
Chapter Four presents an improved version that also tends to
decorrelate these random matches.


E. CONCLUSION 

There exists a wide variety of techniques for shape 
representation and matching. However, each technique has 
its limitations and is restricted to its specific domain of 
shapes. The question naturally arises: is there a scheme
of representation and matching that is simultaneously scale
and orientation invariant and also capable of handling
partially occluded objects? We address this question in the
next chapter.


III. PRELIMINARY FINDINGS


A. IDEAL SHAPE REPRESENTATION 

The manner in which the shape boundary is represented 
determines to a large extent the capability and complexity 
of the matching algorithm. If the representation makes use 
of global information, then partial matching would not be 
possible. If the representation is not orientation
invariant, then the matching algorithm would have to be
repeated across the range of possible orientations.

We can formulate a number of desirable characteristics 
that the ideal shape representation might possess (see also 
[Ref. 17]). These are: 


a. It should be local. By this we mean (i) the coding of
each point on the boundary is determined by a short
section of the boundary, rather than by the entire
boundary, and (ii) the coding is not dependent on an
external reference, such as a centroid.

b. It should be independent of the orientation and scale
of the shape.

c. It should be bounded. In other words, a small change
to part of the boundary should create a small local
change in the representation.

d. It should allow for efficient and robust matching in
the presence of noise (geometric distortion).

e. It should uniquely specify a single boundary (up to
the equivalence classes induced by scaling and rotation).

f. It should contain information about the boundary at
varying levels of detail, so that the matching
process could be performed at different levels of
coarseness.

g. It should be easily and efficiently computable.


These characteristics are ideal, and it is by no means
obvious from the outset that a representation with such
characteristics could be found. Later in this chapter we
shall describe one scheme of representation and matching
that comes close to satisfying these characteristics.




B. DIFFICULTIES IN REPRESENTATION 

For a representation to be scale and orientation
invariant, it is necessary that it be local. Unfortunately,
this is not a sufficient condition. It is necessary because,
if an external reference is used, this must be related to the
boundary, either in distance or direction. This immediately
ties the representation to a fixed scale or orientation.
That it is not sufficient can be seen from the fact that the
curvature-arc length representation is local in nature, and
yet is scale dependent. It is not obvious what the
sufficient conditions are. Rather than look for these, the
author concentrated on finding a local representation that is
both scale and rotation invariant.

In a local representation, each point is influenced by a
small section of the boundary. The question immediately
arises: how is this section to be determined? It is obvious
that the 'extent' of this section must be determined on a
'local' basis too. This 'extent' cannot be determined by
factors such as 'length' or 'number of points' without making
it scale dependent.

The difficulties with shape representation can be traced
to the basic fact that one cannot associate an absolute
external reference with shape, as one could associate, say,
time with radar signals. Shape is a spatial variation, and
the spatial coordinates are, unfortunately, relative in
nature. A radar signal, on the other hand, is a temporal
variation, and for all practical purposes time is an absolute
coordinate; there is no ambiguity regarding the interval of
time and the 'direction' of time.


C. DIFFICULTIES IN MATCHING 

The primary problem with matching is our lack of
knowledge of how to deal with geometric distortion (noise).
Almost all forms of shape representation (boundary and
structural codings) are sensitive to geometric distortions.
As mentioned before, most researchers use some form of
hierarchical scheme in the matching process. We could, for
example, first find matches to small pieces (the smaller the
pieces, the less the effect of distortion), then look for
consistent combinations of these matches. Alternatively, we
could first find matches at low resolution (rough details)
and then search for higher resolution matches in the
vicinity of the lower resolution matches. These
hierarchical schemes increase the matching complexity (more
so if the representation is not scale and rotation invariant)
and the computation cost.

In contrast, conventional signal processing makes
extensive use of the statistical properties of the signal and
noise in order to extract the signal. In shape recognition,
we have very little understanding of the properties of
geometric distortion (noise) and how this could be filtered
out. There is little or no work done in this area. (It
should be added that it is also not obvious how this
problem should be attacked.) Most researchers have
concentrated on specific matching algorithms, using for the
most part ad hoc methods.

A second, more mundane, problem is concerned with
correlation matching. Any representation that uses the arc
length as one of the coordinates has to contend with the fact
that both scale changes and geometric distortions (noise)
affect the length of arc traversed during the coding. Thus
even though the representation may be scale invariant (in
that the particular characteristic being coded at each
boundary point does not vary with scale changes), the
unknown factor in the arc length axis makes matching using
correlation difficult. If the shapes to be matched are
complete, then the scale factor could possibly be removed by
normalizing with respect to the boundary length.
One simple algorithm was devised to correlate scale and
orientation invariant representations that differ in scaling
along the arc length axis. This algorithm basically builds up
a diagram of corresponding points of the two curves to be
matched. The algorithm is described below.

Algorithm 1: Correlation Matching

a. Set up an array A(i,j) of dimension M by N, where M and N
are the numbers of points of curve 1 (denoted by f(n))
and curve 2 (denoted by g(n)) respectively. Initialize the
array to zeros.

b. For each point of f(n), search through the points of
g(n) for those points that match (to within a specified
tolerance). Change the corresponding array entry
to 1, i.e.,

$$A(i,j) = 1 \quad \text{if } f(i) = g(j)$$

c. If the array values are plotted (point for '1', blank
for '0'), a scatter diagram would result. Linear segments
in this diagram correspond to matched segments
of the two curves. The slopes and intercepts of these
linear segments give the relative scale and orientation
of the matched segments of the two boundaries.
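
Steps (a) to (c) can be sketched as follows. This is a minimal illustration under stated assumptions: the function h is a hypothetical invariant boundary code, f and g are the same code sampled at 120 and 90 points, and line_score simply counts marked entries along a candidate line j = slope * i + intercept. It is a crude stand-in for a straight line finder, not the routine used in this work.

```python
import math

def correspondence(f, g, tol):
    """Steps (a) and (b): A[i][j] = 1 where f[i] and g[j] agree within tol."""
    return [[1 if abs(fi - gj) <= tol else 0 for gj in g] for fi in f]

def line_score(A, slope, intercept):
    """Count marked entries lying on the line j = slope * i + intercept."""
    n_cols = len(A[0])
    score = 0
    for i, row in enumerate(A):
        j = round(slope * i + intercept)
        if 0 <= j < n_cols:
            score += row[j]
    return score

def h(t):
    """Hypothetical invariant boundary code over normalised arc length t."""
    return math.sin(2 * math.pi * t) + 0.5 * math.sin(6 * math.pi * t + 1.0)

f = [h(i / 120) for i in range(120)]  # curve 1: 120 samples
g = [h(j / 90) for j in range(90)]    # curve 2: same code, 90 samples
A = correspondence(f, g, tol=0.1)

# Step (c): score a few candidate lines; the true relation is j = 0.75 * i.
for slope in (0.5, 0.75, 1.0):
    print(slope, line_score(A, slope, 0))
```

The slope of the best-supported line recovers the ratio of the sampling densities (here 90/120), which plays the role of the relative scale along the arc length axis.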


An illustration of this can be seen in Figures 3.1, 3.2
and 3.3. Figure 3.1 shows the hypothetical boundary
representation of two shapes to be matched. It is assumed
that these shapes have been coded in a representation scheme
that is scale and orientation invariant. The two shapes
differ in scale (as can be seen in their arc lengths) and
orientation (as evident in the cyclical shift). There is
also some distortion over a section of the boundary (points
1 to 60 in g(n)). Figure 3.2 shows the 'scatter diagram' or
correspondence chart. This is a very busy chart. (It is
interesting to note that linear segments having negative
slopes also correspond to matched sections; if both
boundaries are traversed in the same direction, these matches
are not meaningful, unless one of the objects happens to be
'reflected', i.e., a mirror image.) This chart can be
'cleaned up' to filter out all but those points lying along
the longest linear segment (with positive slope). This is
shown in Figure 3.3. This figure shows that the segment from
point 1 to about 120 of curve f(n) matches the segment
corresponding to points 60 to 150 of curve g(n). It
indicates that there is a poorer match over the remaining
segments. It also shows that the scale difference is 120/90,
or 1.333, and that the two curves are displaced by about 60
points with respect to each other.

The above algorithm basically performs an efficient
correlation over a wide range of scales. The success of the
algorithm depends largely on the sophistication of the
'straight line finder' routine.

In contrast to the correlation approach, the Hough
Transform matching technique is not affected by arc length
variation (in the sense that arc length does not enter into
its computation). This is because the Hough Transform does
not make use of the ordered sequence information of the
boundary points. This makes the Hough Transform sensitive
to false peaks (random matches of unrelated points), but is
also the reason why this technique is so much simpler.
The correlation technique matches points of an ordered
sequence of one curve against corresponding points of an
ordered sequence of another curve. It is this need to keep
the points ordered that increases the computational burden in
this technique.


D. SCALE AND ORIENTATION INVARIANT REPRESENTATION 
It was obvious from the beginning that 'angle
information' is scale and orientation invariant. The angle
between two straight lines remains unchanged regardless of
the scale and rotation. It also became obvious, after
searching for a while, that the arc length to chord length
ratio between two points on the boundary (called the ACR
henceforth for convenience) is also scale and orientation
invariant.

This suggests the following form of representation.
Code each boundary point in terms of the angle made by the
tangent to this point and a specific chord. This specific
chord is the chord connecting the boundary point to the
nearest boundary point (in a specific direction of
traversal) with the property that the ACR between these
points is equal to a pre-determined value. We shall call
this the β-s representation. Figure 3.4 illustrates this.
The curve is not closed to emphasise the fact that this
coding scheme applies to both open and closed figures.
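
The representation just described can be sketched for a sampled
boundary as follows. This is a minimal illustration, not the
implementation used for the figures. It makes three assumptions that
are not stated in the text: the boundary is a list of (x, y) samples,
the tangent at a point is approximated by the direction to the next
sample, and the partner is the first sample at which the ACR reaches
the target (a sampled curve rarely meets the value exactly). The sign
of the angle comes from a cross product and may need flipping to
agree with a particular direction of traversal.

```python
import math


def beta_s(points, acr_target, closed=False):
    """Return [(s, beta_deg), ...] for every point that finds a partner.

    points     : list of (x, y) boundary samples in traversal order
    acr_target : required arc length to chord length ratio (e.g. 1.05)
    closed     : if True the search wraps around the end of the list
    """
    n = len(points)
    result = []
    s = 0.0  # arc length from the first point to point i
    for i in range(n):
        if i > 0:
            s += math.dist(points[i - 1], points[i])
        steps = n if closed else n - i  # exclusive bound on the look-ahead
        arc = 0.0
        for step in range(1, steps):
            prev = points[(i + step - 1) % n]
            cur = points[(i + step) % n]
            arc += math.dist(prev, cur)
            chord = math.dist(points[i], cur)
            # Assumption: accept the first sample whose ratio reaches the
            # target value.
            if chord > 0.0 and arc / chord >= acr_target:
                nxt = points[(i + 1) % n]
                tx, ty = nxt[0] - points[i][0], nxt[1] - points[i][1]
                cx, cy = cur[0] - points[i][0], cur[1] - points[i][1]
                # Signed angle from the (approximate) tangent to the chord.
                beta = math.atan2(tx * cy - ty * cx, tx * cx + ty * cy)
                result.append((s, math.degrees(beta)))
                break
    return result
```

Because only a ratio of lengths and an angle between two directions
are used, scaling, rotating or translating the input samples should
leave the returned angles unchanged up to rounding.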

Implementation of the β-s representation (for ACR 1.05
and 1.3) on shapes R35-52, R34-3lp and R34-102 is given in
Figures 3.5 and 3.6. Outlines of these shapes can be found
in Figures 4.21 and 4.5. (For details of how these shapes
are generated and the meaning behind their names, see the
appendix.) In Figure 3.5, the two curves have been properly
scaled so that the difference in arc lengths between them
is removed. This allows for easy comparison. Figure 3.6
has not been so scaled; the change in the arc length due to
the noise is very evident here.

It can be seen that the representation is virtually
identical over identical portions of the original shapes.
The partial match between R35-52 and R34-3lp is evident.
Figure 3.6 shows the effect of noise on this representation.
It can be seen that a small perturbation in the boundary
curve can cause disproportionately large changes in the
representation. This effect is localised to the
neighbouring region only. Although not shown, it is obvious
that this representation is independent of orientation.
ACR = arc length / chord length


Figure 3.4 Arc to Chord Length Ratio Illustration


The ACR specification is a free parameter that can be
adjusted. The larger the ACR, the larger will be the
average distance between those points satisfying this ratio,
i.e., the less ‘local’ the representation becomes. Also,
objects with relatively smooth boundaries would conceivably
require a smaller ACR specification. The choice of an
‘optimum’ ACR may be very shape-dependent.

Figure 3.6 β-s Representation for R35-52 and R34-102


We note that the ACR specification is basically used to
define the ‘extent’ of the small section of the boundary
discussed previously. This specification is both ‘local’
and scale and orientation invariant. This is by no means
the only specification available. We can develop a whole
family of them. Figure 3.7 illustrates two other possible
specifications. One uses the area to chord length squared
ratio and the other uses the ratio between the length formed
by the two tangents and the chord.





Figure 3.7 Two Other Possible Specifications Besides ACR 


The sensitivity of this ACR specification is due to the
unfortunate fact that geometric distortion affects the arc
length directly. Two points that originally satisfy the ACR
specification in the coding phase may fail to do so in the
matching phase if the segment of the boundary joining them
is distorted. A small perturbation in the boundary can lead
to a large change in the β coding.


E. SCALE AND ORIENTATION INVARIANT HOUGH TRANSFORM 
Given the scale and orientation invariant representation
developed in the last section, we could use the
‘correspondence chart’ algorithm to find possible matches.
However, the particular nature of this representation allows
us to use the simpler technique of the Hough Transform, with
the additional advantage that it is scale and rotation
invariant. We shall call this the β-φ correlation
technique. The coding and matching algorithms (using the
ACR specification) are given below.

Algorithm 2: β-φ Coding


a. Determine a reference line (usually taken to be the
x-axis for convenience).

b. For each boundary point (s_i), locate the next boundary
point (s_j) (in a specific direction of traversal)
such that the ACR specification is met.

c. Determine the angle (β) between the chord joining s_i
to s_j and the tangent to s_i. The sign of this angle
is positive if the segment of the shape bounded by
these points is convex, and negative otherwise.

d. Determine the angle (φ) between the chord and the
reference line, measured clockwise from the reference
line (see Figure 3.8).

e. Determine other independent relation(s) between s_i
and s_j. For instance, the angle (α) between the
tangent lines to these boundary points.

f. Code each boundary point in terms of the vector r,
where r = (φ,α). Set up an R-Table relating β to
(φ,α). The Table is indexed by β (Table 2).
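
Step f can be sketched as a dictionary keyed by a quantised β. This is
an illustration under assumptions: the codes are supplied as
(β, φ, α) triples in degrees, and the bin width used to quantise β
(`beta_bin`, a hypothetical parameter) is not specified in the text.

```python
import math
from collections import defaultdict


def build_r_table(codes, beta_bin=5.0):
    """Group (phi, alpha) vectors by quantised beta.

    codes    : iterable of (beta, phi, alpha) triples in degrees, one
               per coded pair of the reference shape
    beta_bin : assumed width, in degrees, of one beta bin
    """
    table = defaultdict(list)
    for beta, phi, alpha in codes:
        # Round to the nearest bin; floor(x + 0.5) avoids Python's
        # round-half-to-even behaviour.
        key = math.floor(beta / beta_bin + 0.5)
        table[key].append((phi, alpha))
    return dict(table)
```

Each key then holds the set of vectors r = (φ,α) for one row of the
R-Table.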

Algorithm 3: β-φ Matching


a. Set up an accumulator A(i) of N elements, where N is
the number of the (discretized) possible orientations
of the reference line. (Thus N = 36, if each
orientation is 10 degrees wide). Initialize the
accumulator to zeros.

b. For each boundary point on the test shape, obtain β'
and (φ',α').

c. For each pair of (φ,α) indexed by β' in the R-Table,
check if the independent relation matches. If it does,
then determine the possible orientation (θ) of the
reference line from φ and φ'. Increment the
corresponding element of the accumulator. If not,
proceed on to the next boundary point. In other words,

    if |α - α'| < tolerance
    then
        θ = φ - φ'
        A(θ) = A(θ) + 1
    else
        next boundary point


The peaks in the accumulator array then correspond to
possible matches of the two shapes. The locations of the
peaks in the array indicate the most likely orientations of
the reference line, and thus correspond to the relative
orientations between the two shapes.
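
The voting step can be sketched as below. It is a minimal
illustration rather than the original program, and it assumes an
R-Table stored as a dictionary from a quantised β to a list of (φ,α)
pairs, all angles in degrees, and orientation bins centred on
multiples of the bin width. The parameter names and default
tolerances are hypothetical. Unlike the pseudo-code, a failed check
here simply moves on to the next R-Table entry.

```python
import math


def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def vote_orientations(r_table, test_codes, beta_bin=5.0,
                      alpha_tol=5.0, n_bins=36):
    """Accumulate votes for the orientation of the reference line.

    r_table    : dict mapping quantised beta to [(phi, alpha), ...]
    test_codes : iterable of (beta, phi, alpha) for the test shape
    """
    acc = [0] * n_bins
    width = 360.0 / n_bins
    for beta_t, phi_t, alpha_t in test_codes:
        key = math.floor(beta_t / beta_bin + 0.5)
        for phi_r, alpha_r in r_table.get(key, []):
            # Secondary check on the independent relation alpha.
            if abs(angle_diff(alpha_r, alpha_t)) < alpha_tol:
                theta = (phi_r - phi_t) % 360.0
                acc[math.floor(theta / width + 0.5) % n_bins] += 1
    return acc
```

Dividing each accumulator element by the number of test points would
give values that can be read as a fraction of points correlated at
each orientation.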

For ease of future reference, we shall call the
specification used to pair the two points (s_i,s_j) the
primary specification, and the additional specifications
used to relate these points the secondary specification(s).
Also the pair of points (s_i,s_j) shall be called the coded
pair. We shall use β to represent the coded information
based on the primary specification, α to represent the
further constraints based on the secondary specification(s)
and φ to represent the angle between the reference line and
the chord joining the points in the coded pair.


Figure 3.8 β-φ Coding


This technique differs from the basic Hough Transform in
two essential ways. Firstly, this uses a reference line
whose orientation is to be reconstructed, rather than a
reference center whose coordinates have to be determined.
Secondly, each boundary point is identified by β, which is
local (referenced to the local tangent) rather than the
gradient angle, which requires an external reference axis.
These two differences make this matching technique scale and
orientation invariant. Another distinction is the use of an
independent relation (α). By only using those points that
are simultaneously related in both the β and α parameters,
we reduce a fair portion of accidental matches. Of course
we could use more independent relations to further restrict
the possible match points. The limitation will be the
number of possible independent relations available (which
must be scale and orientation invariant and relatively
insensitive to noise). The tolerances set on these
specifications will determine the sensitivity to geometric
distortion. The smaller the tolerance, the more sensitive it
becomes. The tolerance must obviously be tighter for the
primary specification than for the secondary specifications.


TABLE 2
R-TABLE FOR β-φ CODING

Angle between Chord        Set of Vectors
and Tangent to             r = (φ,α)
Boundary Point

β_1                        r_11, r_12, ..., r_1n
β_2                        r_21, r_22, ...
.                          .
.                          .

This scheme is applied to shapes R35-52, R34-3lp and
R34-102. The results, using 2 different values of ACR, are
shown in Figures 3.9 and 3.10. The accumulator values are
normalised by dividing the values by the number of points on
the test curve. (In all the examples in this report, the
test curve is that given by the dashed line). These values
can be easily interpreted as correlation coefficients. For
example, Figure 3.10 indicates that at zero relative
orientation of the 2 shapes, about 40% of the points in the
test shape can be correlated with points in the reference
shape.

Figure 3.9 Matching of R35-52 and R34-3lp Using
β-φ Correlation with ACR Specification

Figure 3.10 Matching of R35-52 and R34-102 Using
β-φ Correlation with ACR Specification

By the nature of the coding this correlation is not
point to point correlation, but rather point-on-a-segment to
point-on-a-segment correlation; i.e., the correlation is
made on the basis of the behavior of the boundary in the
vicinity of the point. Visually, we can see that the
correlation should be higher than this. The low correlation
is a direct consequence of the sensitivity of the ACR to
geometric distortion. Both figures, however, correctly
indicate that the best correlation between the shapes being
tested occurs at zero degrees relative orientation.

To improve the correlation, we need to make the β-φ
coding less sensitive to noise. This implies that we need
an alternative primary specification and, perhaps,
alternative secondary specifications too. The other
possible specifications mentioned earlier were tried and
found to be unsuitable as well.

In the next chapter, we shall describe a new primary
specification that is less sensitive to the effects of
noise. Using this, the resulting correlation between R35-52
and R34-102 increases to 80% (see Figure 4.5). To do this we
need to forgo the demand for scale and orientation
invariance in the coding. However, the matching algorithm
can be easily modified to match shapes of arbitrary scale
and orientation with a slight increase in computation.

IV. A NEW CORRELATION TECHNIQUE


A. INTRODUCTION 

The algorithm developed in the previous chapter is
sensitive to noise. This is due to one main reason. We
have removed the scale unknown by using the ACR measure; arc
length is, unfortunately, very sensitive to geometric
distortion. In other words, we have replaced an unknown
factor with an uncertain measure. Thus, unless we can find
an alternative measure that is scale independent and
reasonably immune to noise, this approach may be of limited
practical use. Such a measure was not found.

We therefore remove the scale invariance constraint.
What we eventually found is a new and interesting approach
to boundary coding. In essence, each boundary point is
coded with respect to another point picked at random from
the boundary. Note that this coding is not scale and
orientation invariant. In fact identical shapes would yield
different codes if different sets of random numbers are
used!


B. RANDOM CODING 

We use, as the primary specification, the random
separation between the points of the coded pair. The
property coded at each point is again β, the angle between
the tangent to this point and the chord joining the coded
pair. To retain the ‘local’ features (essential for partial
match applications), the range of the allowable separation
(called the coding range henceforth) is restricted. For
illustrative purposes, 3 coding ranges are used in the
examples below, namely 10 to 60 points, 80 to 130 points and
150 to 200 points (i.e., for the first range, the second
element in the coded pair is picked from any point that lies
between 10 and 60 points away from the first element, etc).

Two secondary specifications are used: the ACR and the
angle made by the tangents to each point in the coded pair
(i.e., α in the previous algorithm). In the matching
process, since the points are paired randomly, it becomes
necessary to check each point against all other points in
the test shape. In practice, since the coding range is
itself restricted, this process can also be restricted to a
smaller section of the boundary. In the examples that
follow, this search range is limited to half the entire
boundary length. Further savings in computation are
achieved by checking only alternate points within this
range.
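
The restricted search can be sketched as a simple index generator.
This is an assumption-laden illustration: it treats the boundary as
closed (indices wrap around), takes the search range as N/2 by
default, and starts the alternate-point stepping at the immediate
neighbour; the text does not say at which offset the stepping begins.
The function and parameter names are hypothetical.

```python
def candidate_partners(i, n_points, search_range=None, stride=2):
    """Indices against which test point i is checked.

    search_range : number of points ahead of i to consider
                   (defaults to half the boundary, as in the examples)
    stride       : 2 checks only alternate points
    """
    if search_range is None:
        search_range = n_points // 2
    return [(i + k) % n_points for k in range(1, search_range + 1, stride)]
```

For a 500-point test shape this reduces the work per point from 499
pairings to 125.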

The basic algorithm for this technique is similar to the
previous one. For clarity, we shall restate it. Note that
β and φ below refer to the same angles as in the previous
algorithms, while α is used differently here.


Algorithm 4: Improved β-φ Correlation


a. For each boundary point in the reference shape,
select another boundary point at random from those
within the allowable range. Determine the (β,φ,α)
relation between the coded pairs thus found. (Note:
α contains two components, the ACR and the tangent
angle measures). See Figure 4.1.

b. Construct the R-Table in the same manner as before.

c. Initialize the accumulator array as before.

d. To match a test shape, determine for each boundary
point the (β',φ',α') relation with all other boundary
points within the search range. For each (β',φ',α') and
corresponding (β,φ,α) from the R-Table, reconstruct
the reference line as before.

e. Peaks in the accumulator array correspond to possible
matches of the two shapes, with the location of the
peaks corresponding to the relative orientation of
the two shapes.
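
The random pairing in step a can be sketched as follows. This is a
hedged illustration only: the function name, the `seed` argument and
the handling of open boundaries (points too close to the end are
simply left uncoded) are assumptions, not details taken from the
text.

```python
import random


def random_partners(n_points, lo, hi, seed=None, closed=True):
    """Pair each boundary point i with a point lo..hi samples ahead.

    Returns a list of (i, j) index pairs. For an open boundary a point
    whose randomly chosen partner would fall past the end is skipped.
    """
    rng = random.Random(seed)
    pairs = []
    for i in range(n_points):
        separation = rng.randint(lo, hi)  # inclusive at both ends
        j = i + separation
        if closed:
            pairs.append((i, j % n_points))
        elif j < n_points:
            pairs.append((i, j))
    return pairs
```

A call such as `random_partners(500, 10, 60, seed=1)` corresponds to
the 10 to 60 point coding range; changing the seed changes the code
obtained for the same shape, as noted earlier.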


Figure 4.1 β-φ Coding Using Random Separation


We shall discuss the key features of this technique and
provide heuristic explanations, where possible, on the
‘hows’ and ‘whys’ of it. These features are verified in the
numerous examples that follow.


C. FEATURES 
1. Scale and Orientation Invariance

The coding is not scale and orientation invariant.
The scale unknown is resolved in the matching algorithm by
pairing each point with all other points within the search
range. This, in essence, performs a matching over a range of
scale. The orientation unknown is not a problem, since the
output of the matching process will indicate the relative
orientation of the two shapes. The correlation is
performed, in essence, over the range of possible relative
orientations. In this respect, this correlation technique is
not affected by unknown scale and orientation and can be
said to be invariant to these.

2. Robustness 

The random separation helps to ‘break’ down the
effects of noise. Consider the alternative of using a fixed
separation, say n. Then if the coded pair (s_i,s_{i+n}) is
affected by noise, the next pair (s_{i+1},s_{i+1+n}) is
likely to be similarly affected. However, if the separation
is random, and if (s_i,s_j) is affected by noise, it does
not follow that (s_{i+1},s_k) (where j and k are randomly
picked) would also be affected. More importantly, even if
it is, the effects in the two coded pairs are unlikely to be
the same, i.e., the false matches they cause are not likely
to be correlated.

For the case of fixed separation, because of the
strong correlation (close proximity) between the coded
pairs, noise in their coding is likely to be correlated,
giving rise to false ‘peaks’ during the process. This
implies that in order to achieve the best decorrelation of
false matches, the boundary should be coded such that the
parameter, β, is uniformly distributed across its range,
-180 to +180 degrees. This may require extending the coding
range to a substantial fraction of the entire boundary
length, which may not always be desirable since the coding
then becomes less ‘local’ in nature.

Another factor that helps to reduce the effects of
noise is the nature of the matching algorithm. Figure 4.2
illustrates this. The solid line there refers to a portion
of the reference shape and the dashed line to the test
shape. Point s_i is paired with s_j during the original
coding. In the matching algorithm, since s_i is paired with
all other points, it would eventually be paired with one
that is close geometrically to the original s_j (i.e., ŝ_j
in Figure 4.2) and that also satisfies the secondary
specifications. Thus, we would expect to recover the
orientation of the reference line.





Figure 4.2 Matching in the Presence of Noise 


3. "Local" Characteristics 
The choice of the coding range determines the amount 
of ‘local’ information captured in the coding. The lower 
the upper limit of the coding range, the more ‘local’ the 
representation becomes. If the coding range is the entire 
boundary, then the coding takes on a global nature. This 
will be clearly illustrated in the examples on Partial 
Matching below. 
4. Discrimination 
The distance of the coding range from the point
being coded also determines the level of discrimination in
the matching process. The closer this distance is, the
smaller the segments for which the matching algorithm would
be trying to find matches. What is important here is the
fact that small segments tend to look more similar than
larger segments. Thus, a small segment from any curve would
tend to look like a linear segment. Discrimination of two
shapes cannot be reliably done at too small a scale. This
also implies that the lower limit of the coding range should
be as large as the longest linear segment of the shape, if
the matching process is not to be overwhelmed by matches of
short linear segments.

The algorithm uses the secondary specifications to
reject obvious false matches. The types of discrimination
possible with our choice of specifications are illustrated
in Figure 4.3. If scale information is also available, then
it can be effectively incorporated as an additional
specification. An important observation is that the
tolerances set on these specifications determine the ‘noise
rejection threshold’. The larger the tolerance, the better
the matching (detection probability) under noise, but the
higher too would be the number of false matches (false
alarms). The tolerances used in most of the examples below
are 0.1 for the ACR measure and 5 degrees for the tangent
angle measure.
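
The acceptance test implied by these tolerances can be sketched as
below. Two points are assumptions rather than statements from the
text: the ACR tolerance is applied to the absolute difference of the
two ratios, and the tangent angle difference is wrapped so that, for
example, 179 and -179 degrees are treated as 2 degrees apart. The
function and argument names are hypothetical.

```python
def secondary_match(acr_ref, tan_ref, acr_test, tan_test,
                    rtol=0.1, gtol=5.0):
    """True if a reference pair and a test pair agree within tolerance.

    acr_* : arc length to chord length ratio of each coded pair
    tan_* : angle between the two tangents of each coded pair (degrees)
    rtol  : tolerance on the ACR (assumed to be an absolute difference)
    gtol  : tolerance on the tangent angle, in degrees
    """
    d_angle = (tan_ref - tan_test + 180.0) % 360.0 - 180.0
    return abs(acr_ref - acr_test) <= rtol and abs(d_angle) <= gtol
```

Raising `rtol` or `gtol` admits more noisy true matches and more
accidental ones alike, which is the trade-off described above.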

The reader may wonder why we use the ACR
specification when it has been stated that this
specification is too sensitive to geometric distortions.
There is a distinction between the role the ACR plays in the
previous algorithm and the role it plays in the present one.
Previously it was used as a primary specification, whereas
here it is used only as a confirmatory specification; the
tolerance on it is therefore looser here, making it less
sensitive to noise.

5. "End Losses" 

The ‘look forward’ characteristics in the coding
process mean that the output matched segments tend to be
shorter than the actual match in the input segments. This
is because the section ‘forward’ of the points being matched
must itself match before the ‘current’ segment can match.
This will be clearly illustrated in the examples on Partial
Matching too. The loss of the ‘forward ends’ can be easily
removed if the coding and matching are performed in both
directions.


Figure 4.3 Discrimination Using Secondary Specifications

6. Lack of Internal Consistency Checks

When matched segments of the shapes are found, the
present algorithm simply counts the number of points in
these segments and expresses this as a fraction of the
number of points in the test boundary. It does not check to
see if the relative positions of these segments in the test
and reference shapes are consistent. This additional check
should eliminate false matches too. This is the main
weakness of this technique. Such a check could be
implemented (similar to those used in hierarchical search).
It has not been done, to keep this basic algorithm simple.


D. RESULTS 

The algorithm is applied to numerous test shapes below.
These examples verify the various comments made above. It
is hoped that the large number of test cases will give the
reader confidence in the use of this new technique. In the
examples, the number of points in the shapes is varied to
ensure that any scale information that may be implicitly
present is removed. As a reminder, the second number in the
shape title indicates the number of points in that shape.
Thus R35-52 has 500 points. Appendix A contains more details
of these shapes.

In the discussion and figures that follow, N refers to
the number of sample points in the test shape, and RTOL and
GTOL refer to the tolerances set in the ACR and tangent
angle specifications respectively. One final note before we
see the results. The direction of the orientation angle is
as follows. A positive relative orientation of, say, 90
degrees means that the test shape (dashed line) is rotated
90 degrees counterclockwise from the reference shape.

1. Geometric Distortion 

To study the sensitivity of this technique to noise,
we introduce distortion at varying levels into the test
shapes. Figures 4.4 to 4.7 show the results for one set of
shapes. When the two shapes are identical, correlation is
100% as expected (Figure 4.4). As the amount of distortion
increases, the level of correlation decreases, until it
reaches 60% for Figure 4.7. However the correlation level
away from the peak value remains relatively constant,
illustrating the fact that matches at these orientations are
random in nature. Note also that the lower coding range (10
to 60 points) produces more apparent matches, since smaller
segments tend to match better than larger segments. The
correlation peak occurs at the correct relative orientation,
i.e., zero degrees, since the two shapes are identically
oriented. The result for Figure 4.5 should be compared
against Figure 3.10, which uses ACR as the primary
specification. That produces only 40% correlation between
the two shapes. Using random coding, the correlation has
increased to 80%.


Figure 4.4 Correlation Between R35-52 and R35-52

Figure 4.5 Correlation Between R35-52 and R34-102

Figure 4.6 Correlation Between R35-52 and R34-152

Figure 4.7 Correlation Between R35-52 and R34-252
(correlation coefficient against relative orientation, i.e.
angle of the reference line; search range N/2)

The next figure, Figure 4.8, is almost identical to
Figure 4.4 despite the fact that the search range has been
increased from N/2 to N-1. The fact that searching through
a larger search range does not produce significantly more
correlation attests to the ‘noise’ rejection capability of
the algorithm.

The algorithm is next applied to a set of more
‘difficult’ test shapes. Figures 4.9 to 4.12 show the
correlation when the test shape is scaled down, rotated and
distorted. In spite of the scale and orientation
differences, the algorithm correctly locates the match at 90
degrees relative orientation. More significantly, the
amount of correlation is not unreasonable compared to what
one might estimate visually. For Figure 4.12, the distortion
has more or less made the test shape symmetrical. It is thus
not surprising for the algorithm to locate two peaks at plus
and minus 90 degrees.

Figure 4.9 Correlation Between R35-52 and R32-0llr

Figure 4.11 Correlation Between R35-52 and R32-3lr


The amount of correlation is affected by the noise
threshold set by the secondary specifications. If the
tolerances on these specifications (i.e., GTOL and RTOL) are
increased, the peak correlation can be seen to increase from
about 30% to 50% (Figure 4.13). Inevitably, the number of
false matches increases too.

Figures 4.14 to 4.20 provide further examples for
different sets of shapes. The reference shape becomes
progressively ‘smoother’. The general level of correlation
is higher for these figures than for the previous set. This
is due to the general symmetry and gross similarity between
these shapes. Figure 4.20 provides the extreme case where
the test shape is almost circular. Because of the symmetry,
the correlation at all orientations is nearly constant.
Also, since there is marked similarity between the test and
reference shapes, this level of correlation is also very
high. The reader may wonder about the ability of the
algorithm to distinguish between very smooth shapes such as
ellipses. This is further discussed under the section on
Discrimination below.

2. Partial Matching

Figure 4.21 shows the ability of the algorithm to
detect partial matches. Except for the lowest coding range,
the results show a distinct correlation peak at zero
relative orientation. The multiple peaks in the lowest
coding range are due to the general similarity of shorter
segments compared to longer segments. Figure 4.22 is a plot
of the correlated points (for the 150 to 200 coding range).
It shows clearly the segment of partial match. Also, it
shows that the correlated points at the other orientations
are scattered across the boundary. In obtaining the value of
correlation, the algorithm simply sums up the number of
correlated points at each orientation.


Figure 4.13 Correlation Between R35-52 and R32-5lr
Using Relaxed Tolerances
(correlation coefficient against relative orientation, i.e.
angle of the reference line; search range N/2)

Figure 4.14 Correlation Between R44-52 and R43-31
(correlation coefficient against relative orientation, i.e.
angle of the reference line; search range N/2)

Figure 4.16 Correlation Between R14-52 and R14-51

Figure 4.19 Correlation Between R25-52 and R23-1ll1n


Figure 4.23 indicates the location of the matched
segments for two coding ranges. The ability of the
algorithm to correctly locate the matched segments is
clearly illustrated. The two diagrams also show clearly the
effects of ‘end losses’. At the 150 to 200 coding range, the
‘look forward’ section is much longer than for the 10 to 60
range. Consequently, the loss of matched points at the
forward end is higher. As mentioned before, this loss could
be minimised by modifying the coding and matching algorithm
to look in both directions.

Figures 4.24 to 4.26 show the effect of noise on 
partial matching. As before, the peak correlation decreases 
with noise while the off-peak level remains relatively 
constant. Note that the coding range 150 to 200 produces 
almost zero correlation. This is not surprising since the 
reference shape boundary has only 200 data points. At this 
coding range, almost the entire boundary is being coded at 
each point! This illustrates clearly the relationship 
between the coding range and the ‘local’ characteristics in 
the coding. For partial match applications, it is essential 
that the coding range be restricted to a short section of 
the boundary. Figure 4.27 shows the location of the partial 
match for the relative orientation -/5 degrees. 

Figure 4.28 shows the matching of a small section
of a ‘wing’ to the reference shape R32-3lr. A good match
is found at about -/75 degrees. Figure 4.29 shows the reverse
situation, where the reference shape is matched against the
given wing. Possible matches are located at about 95 degrees
and -105 degrees. The matched segment is indicated in
Figure 4.30 (for orientation 95 degrees). These segments
agree with our visual observation.

Figure 4.26 Correlation Between R32-3lr and R34-22p

Figure 4.27 Matched Segments Between R32-3lr and R34-22p
at -/5 Degrees Relative Orientation

Figure 4.28 Correlation Between R32-3lr and R33-22p

Figure 4.30 Matched Segments Between R33-22p and R32-1l1r
at 95 Degrees Relative Orientation

Figure 4.32 Correlation Between R14-52 and R13-051p


Figures 4.31 to 4.32 provide more examples of
partial matches. Note that in all these, the location of the
peak correlation is correctly obtained. However, because of
the general symmetry in the shapes, the general level of the
correlation (away from the peak) is also significant. If
the ‘scatter’ of correlated points is taken into account,
these false matches could possibly be reduced. The simplest
way to do this would be to give different weightings to the
correlated points depending on whether these are isolated
points or are part of a continuous segment.
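
One possible form of such a weighting is sketched below. It is not
part of the algorithm as implemented; it is only an illustration of
the suggestion, with a hypothetical threshold `min_run`. Correlated
points that belong to a run of at least `min_run` consecutive points
receive weight 1 and shorter runs receive weight 0. The boundary is
treated as open, so a run is not joined across the ends of the list.

```python
def weighted_correlation(matched, min_run=3):
    """Fraction of test points lying in sufficiently long matched runs.

    matched : list of booleans, one per test boundary point in order,
              True where the point was correlated at this orientation
    min_run : shortest run of consecutive matches that still counts
    """
    if not matched:
        return 0.0
    total = 0
    run = 0
    # A trailing False flushes the final run.
    for flag in list(matched) + [False]:
        if flag:
            run += 1
        else:
            if run >= min_run:
                total += run
            run = 0
    return total / len(matched)
```

With `min_run=1` the function reduces to the plain count used by the
present algorithm.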

3. Discrimination Capability

In this final section, we examine the discrimination
capability of the algorithm. Figures 4.33 to 4.35 show the
low correlation found when matching R35-52 against the other
shapes. The next set of examples (Figures 4.36 to 4.39)
shows the discrimination between a ‘smoother’ class of
shapes. There are no prominent peaks in the correlation.
However the general level of correlation is significantly
higher because of the nature of the shapes (smooth with
plenty of linear segments). Consider Figure 4.39 for
example. The large number of linear segments in both shapes
gives rise to the high value of correlation between them.

Figure 4.40 shows the location where a partial match
is found (at -175 degrees). This figure illustrates the
main weakness of this algorithm: it does not check whether
the relative positions of the matched segments in both the
test and reference shapes are consistently related. In this
particular example, different segments in the test shape
have obviously been matched to the same segment in the
reference shape. To overcome this, one possible solution
would be to use some sort of hierarchical matching scheme
whereby the matched segments are first arranged according to
their lengths and then checked for consistency, beginning
with the longest matched segment, and so on.

The question of the ability to distinguish between 
highly symmetrical shapes such as ellipses has been raised 
earlier. Figures 4.41 and 4.42 show how the algorithm 
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matches ellipses of different major to minor axis ratio (b/a 
ratio). The b/a ratios for these ellipses are 1.5 for E3-152,
2.0 for E3-22 and 0.3 for E2-031. The results show that the
ellipse of b/a ratio 1.5 is better correlated with that of
ratio 2.0 than with that of ratio 0.3 (or, equivalently, an
a/b ratio of 3.33). This agrees with visual observation.


E. CONCLUSIONS 

We have demonstrated the capability of this new tech- 
nique and the effects of varying the various parameters on 
its performance. The main weakness of this technique has 
also been highlighted. Although the examples used have been 
shapes with closed boundaries, there is nothing in the 
algorithm that is specific to this type of shape. The
algorithm is therefore equally applicable to shapes with 
open boundaries. 

The algorithm is implemented on the IBM 3033 computer.
Computation time depends on the shapes being matched. Shapes
without distinct features (or, equivalently, with lots of
similar segments), such as R25-52, require the most
computation. On average, the computation of one correlation
curve between two 500-point shapes takes less than 10 CPU
seconds. This is with a search range of N/2. If this is
reduced to N/3, the figure drops to about 6 seconds. In
our examples we have used a search range of N/2. This is
probably larger than necessary, since it implies that the
coding range is as large as this. One is not likely to use
so large a coding range, since the ‘local’ features in the
shape being coded would then not be captured. (The choice
of N/2 for the examples is primarily to test the ability of
the algorithm to reject spurious matches by means of the
additional checks.)
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V. SUMMARY 


We began with a search for a representation scheme that
would be scale and orientation invariant. Such a scheme was 
found. However, to achieve the scale invariance, the scheme 
required the local behaviour of the boundary to be rela- 
tively noise free. 

A more general technique was subsequently developed. 
The essence of this technique was the use of random boundary 
points in the coding, which helped to decorrelate false
matches. The matching algorithm used the basic concept of
Hough Transform matching, modified to remove its
dependence on scale and orientation information.

This new correlation technique was applied to a large 
number of shapes. Results verified its ability to recognise 
shapes (complete or partial) of arbitrary scale and orienta- 
tion and its robustness against noise. Its discrimination 
capability among different shapes was also demonstrated. 
The main weakness of the present algorithm lies in its
simplistic way of summing the correlated points without
regard to how these are distributed or interrelated.

The biggest improvement to this algorithm would come 
from incorporating an efficient check for consistency in the 
relative positions of the matched segments. The coding and 
matching process could also be modified to look in both 
‘directions’, so as to reduce the ‘end losses’. Further 
study could also be made on the choice of the various param- 
eters used, namely the coding range, search range and 
tolerances on the secondary specifications. Since the 
reference shape would be a known entity, it would be 
possible, and indeed advantageous, to use different sets of
parameter values for different classes of shapes, each
optimised to the particular shape. In this report, we have
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discussed one set of primary and secondary specifications. 
These may not be the most effective set available. Other 
possible specifications could also be examined. 

Finally, we note that the main contribution of this study
is the suggestion of an alternative means of boundary
coding, with which an effective and efficient correlation
technique can be used to match two-dimensional shapes.
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APPENDIX 
GENERATION OF TEST SHAPES 


The shapes used for verifying the algorithm (except for 
ellipses) are generated using a Fourier series type method. 


Specifically, the x,y coordinates are determined by:

    x(\theta) = A(\theta) \cos(\theta + \phi)
    y(\theta) = A(\theta) \sin(\theta + \phi)

with

    A(\theta) = \exp[r(\theta)]
    r(\theta) = \sum_i a_i \sin(f_i \theta + \gamma_i)

where \phi is the angle through which the shape is rotated.

The a_i, \gamma_i and f_i can be varied to produce different
shape patterns. This method ensures that the figure generated
is closed. The data points would, however, not be equally
spaced along the arc length. (In practice, the boundary
data would be uniformly sampled.) The data points are next
approximated using a B-splines routine with variable knots
[Ref. 18], and resampled at approximately equal arc length
spacing.
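
The generation and resampling steps can be illustrated with
the short sketch below. It is not the routine used in this
study. The function names and all parameter values are
hypothetical, integer frequencies f_i are assumed so that the
curve closes after one revolution, and plain linear
interpolation along the sampled polygon stands in for the
B-splines routine of [Ref. 18] (so no closeness-of-fit control
is modelled).

```python
import math


def generate_shape(a, f, gamma, phi=0.0, n_points=500):
    """Sample the curve at equal steps in theta (not in arc length)."""
    points = []
    for k in range(n_points):
        theta = 2.0 * math.pi * k / n_points
        # r(theta) = sum_i a_i * sin(f_i * theta + gamma_i)
        r = sum(ai * math.sin(fi * theta + gi)
                for ai, fi, gi in zip(a, f, gamma))
        radius = math.exp(r)  # A(theta) = exp[r(theta)], always positive
        points.append((radius * math.cos(theta + phi),
                       radius * math.sin(theta + phi)))
    return points


def resample_equal_arc(points, n_out):
    """Resample a closed polygon at equal arc-length spacing using
    linear interpolation (a stand-in for the spline approximation)."""
    closed = points + [points[0]]
    cum = [0.0]  # cumulative arc length at each polygon vertex
    for (x0, y0), (x1, y1) in zip(closed, closed[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    out = []
    seg = 0
    for k in range(n_out):
        target = total * k / n_out
        while cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = (target - cum[seg]) / span if span > 0 else 0.0
        (x0, y0), (x1, y1) = closed[seg], closed[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


# Illustrative coefficients only; not those of any shape set in this report.
shape = generate_shape(a=[0.3, 0.1], f=[2, 5], gamma=[0.0, 1.0])
resampled = resample_equal_arc(shape, 200)
```

The spacing between successive points of `shape` varies with
A(\theta), whereas the points of `resampled` are close to
equally spaced along the boundary.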

There are two reasons for using B-splines. Firstly, the 
approximation routine available allows one to vary the 
closeness of fit, which enables us to introduce distortion 
gradually into the test shapes. Also, there has been an 
earlier proposal to study how the knot positions and the
B-spline coefficients could be used for shape recognition
purposes. (This was not carried out because of difficulties
with the knot placement criteria; no satisfactory theoretical
study on this has been done.)
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Each shape is coded with a mnemonic. Except for the 
ellipses, each mnemonic is prefixed with an R and has the
general form:

    Rnn-nnna

where ‘n’ denotes a digit and ‘a’ denotes a letter.
The first digit identifies the set of shapes (1, 2, 3 or 4).
The second digit indicates the number of samples (in
hundreds). The last digit gives the relative scale
(1 or 2). The remaining one or two digits indicate the
closeness of fit used in the spline routine. The final
letter is optional and gives additional information about
the shape (p for partial, r for rotated and n for noise
added). For example,
    R35-252

represents:

    3  ---- shape set #3
    5  ---- 500 sample points
    25 ---- closeness of fit factor is 25
    2  ---- relative scale of 2

and

    R13-011n

represents:

    1  ---- shape set #1
    3  ---- 300 sample points
    01 ---- closeness of fit factor is 0.1
    1  ---- relative scale of 1
    n  ---- noise added to a portion of the boundary
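
As an aid to reading the figure captions, the naming rule can
be expressed as a small decoder. This is a sketch only: the
function and field names are assumptions, and the
closeness-of-fit digits are returned as written rather than
converted to a factor, since the conversion is given above
only by example.

```python
import re

# R, set digit, sample-count digit, hyphen, one or two fit digits,
# scale digit, optional suffix letter.
_PATTERN = re.compile(r"^R(\d)(\d)-(\d{1,2})(\d)([prn]?)$")
_SUFFIX = {"p": "partial", "r": "rotated", "n": "noise added", "": None}


def parse_mnemonic(code):
    """Decode a shape mnemonic of the form Rnn-nnna into its fields."""
    m = _PATTERN.match(code)
    if m is None:
        raise ValueError("not a valid R-type mnemonic: %r" % code)
    shape_set, hundreds, fit, scale, suffix = m.groups()
    return {
        "shape_set": int(shape_set),
        "sample_points": int(hundreds) * 100,
        "fit_digits": fit,              # kept as text, e.g. '25' or '01'
        "relative_scale": int(scale),
        "note": _SUFFIX[suffix],
    }
```

For instance, `parse_mnemonic("R13-051p")` reports shape set 1,
300 sample points, fit digits '05', a relative scale of 1 and
the note 'partial'.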


The ellipses are generated from their parametric
equations. These are prefixed by the letter E. The first
digit refers to the number of sample points. The last
digit indicates the relative scale and the remaining
digits refer to the major to minor axis ratio.
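
A minimal sketch of generating such an ellipse from the
parametric equations x = a cos t, y = b sin t is given below.
The function name is hypothetical, and two details are
assumptions not stated above: the parameter t is sampled at
equal steps, and the relative scale is applied to the
semi-axis a.

```python
import math


def ellipse_points(ratio, n_points, scale=1.0):
    """Return n_points on the ellipse x = a cos t, y = b sin t,
    with b/a = ratio and a = scale (assumed convention)."""
    a = scale
    b = scale * ratio
    return [(a * math.cos(2.0 * math.pi * k / n_points),
             b * math.sin(2.0 * math.pi * k / n_points))
            for k in range(n_points)]
```

As with the other test shapes, points sampled in this way are
not equally spaced along the arc length.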
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